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1. Preamble 



In these lecture notes I will give a pedagogical introduction to some common 
aspects of 4 different problems: (i) random matrices (ii) the longest increasing 
subsequence problem (also known as the Ulam problem) (iii) directed polymers 
in random medium and growth models in (1 + 1) dimensions and (iv) a problem 
on the alignment of a pair of random sequences. Each of these problems is almost 
entirely a sub-field by itself and here I will discuss only some specific aspects of 
each of them. These 4 problems have been studied almost independently for 
the past few decades, but only over the last few years a common thread was 
found to link all of them. In particular all of them share one common limiting 
probability distribution known as the Tracy-Widom distribution that describes the 
asymptotic probability distribution of the largest eigenvalue of a random matrix. I 
will mention here, without mathematical derivation, some of the beautiful results 
discovered in the past few years. Then, I will consider two specific models (a) a 
ballistic deposition growth model and (b) a model of sequence alignment known 
as the Bernoulli matching model and discuss, in some detail, how one derives 
exactly the Tracy-Widom law in these models. The emphasis of these lectures 
would be on how to map one model to another. Some open problems will be 
discussed at the end. 



2. Introduction 

In these lectures I will discuss 4 seemingly unrelated problems: (i) random ma- 
trices (ii) the longest increasing subsequence (LIS) problem (also known as the 
Ulam problem after its discoverer) (iii) directed polymers in random environment 
in (1 + 1) dimensions and related random growth models and (iv) the longest 
common subsequence (LCS) problem arising in matching of a pair of random 
sequences (see Fig. Q]). These 4 problems have been studied extensively, but al- 
most independently, over the past few decades. For example, random matrices 
have been extensively studied by nuclear physicists, mathematicians and statisti- 
cians. The LIS problem has been studied extensively by probabilists. The models 
of directed polymers in random medium and the related growth models have been 
a very popular subject among statistical physicists. Similarly, the LCS problem 
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Fig. 1 . All 4 problems share the Tracy-Widom distribution. 



has been very popular among biologists and computer scientists. Only, in the last 
10 years or so, it became progressively evident that there are profound links be- 
tween these 4 problems. All of them share one common probability distribution 
function which is called the Tracy-Widom distribution. 

This distribution was first discovered in the context of random matrices by 
Tracy and Widom [1]. They calculated exactly the probability distribution of the 
typical fluctuations of the largest eigenvalue of a random matrix around its mean. 
This distribution, suitably scaled, is known as the Tracy-Widom (TW) distribu- 
tion (see later for details). Later in 1999, in a landmark paper [2], Baik, Deift 
and Johansson (BDJ) showed that the same TW distribution describes the scaled 
distributions of the length of the longest increasing subsequence in the LIS prob- 
lem. Immediately after, Johansson [3], Baik and Rains [4] showed that the same 
distribution also appears in a class of directed polymer problems. Around the 
same time, Prahofer and Spohn showed [5] that the TW distribution also appears 
in a class of random growth models known as the polynuclear growth (PNG) 
models. Following this, it was discovered that the TW distribution also occurred 
in several other growth models, such as the 'oriented digital boiling' model [6], 
a ballistic deposition model [7], in PNG type of growth models with varying ini- 
tial conditions and in various geometries [8,9] and also in the single-step growth 
model arising from the totally asymmetric exclusion process [10]. Also, a some- 
what direct connection between the stochastic growth models and the random 
matrix models via the so called 'determinantal point processes' was found in a 
series of work by Spohn and collaborators [11], which I will not discuss here 
(see Ref. [11] for a recent review). Finally, the TW distribution was also shown 
to appear in the LCS problem [12], which is also related to these growth mod- 
els. Apart from these 4 problems that we will focus here, the TW distribution 
has also appeared in many other problems, e.g., in the mesoscopic fluctuations 
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Random Matrices, the Ulam Problem, Directed Polymers & Growth Models, and Sequence Matching 

of excitation gaps in a dirty metal grain or a semiconductor quantum dot induced 
by a nearby superconductor [13]. The TW distribution also appears in problems 
related to finance [14]. 

The appearence of the TW distribution in so many different problems is really 
interesting, suggesting an underlying universality that links all these different 
systems. The purpose of my lectures would be to explore and elucidate the links 
between the 4 problems stated above. The literature on this subject is huge. I 
will not try to provide any detailed derivation of the mathematical results here. 
Instead, I will state precisely the known results that we will need to use and put 
more emphasis on how one maps one problem to the other. In particular, I will 
discuss two problems in some detail and show how the TW distribution appears in 
them. These two problems are: (i) a random growth model in (1 + 1) dimensions 
that we call the anisotropic ballistic deposition model and (ii) a particular variant 
of the LCS problem known as the Bernoulli matching (BM) model. In the former 
case, I will show how to the map the ballistic deposition model to the LIS problem 
and subsequently use the BDJ results. In the second case, I will show that the BM 
model can be mapped to a particular directed polymer model that was studied by 
Johansson. The mappings are often geometric in nature, are nontrivial and serves 
two purposes: (a) to elucidate how the TW distribution appears in somewhat 
unrelated problems and (b) to derive exact analytical results in problems such as 
the sequence matching models, where precise analytical results were missing so 
far. 

The lecture notes are organized as follows. In Section 3, 1 will describe some 
basic results of the random matrix theory and define the TW distribution pre- 
cisely. In Section 4, the LIS problem will be described along with the main 
results of BDJ. Section 5 contains a discussion of the directed polymer problems, 
and in particular the main results of Johansson will be mentioned. In Section 5.1, 
I will describe how one maps the anisotropic ballistic deposition model to the 
LIS problem. Section 6 contains a discussion of the LCS problem. Finally, I will 
conclude in Section 7 with a discussion and open problems. 



3. Random Matrices: the Tracy- Widom distribution for the largest eigen- 
value 

Studies of the statistics of the eigenvalues of random matrices have a long history 
going back to the seminal work of Wigner [15]. Since then, random matrices 
have found applications in multiple fields including nuclear physics, quantum 
chaos, disordered systems, string theory and number theory [16]. Three classes 
of matrices with Gaussian entries have played important roles [16]: (N x N) 
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real symmetric (Gaussian Orthogonal Ensemble (GOE)), (N x N) complex Her- 
mitian (Gaussian Unitary Ensemble (GUE)) and (27V x 27V) self-dual Hermi- 
tian matrices (Gaussian Symplectic Ensemble (GSE)). For example, in GOE, 
one considers an (N x N) real symmetric matrix X whose elements s are 
drawn independently from a Gaussian distribution: P(xu) = cxp[— xf,/2] 

and P(xij) = exp[— xfj] for i < j. Thus the joint distribution of all the 
N(N + l)/2 independent elements is just the product of the individual distri- 
butions and can be writen in a compact form as P[X] = An exp[— tr(X 2 )/2], 
where An is a normalization constant. One can similarly write down the joint 
distribution for the other two ensembles [16]. 

One of the key results in the random matrix theory is due to Wigner who de- 
rived, starting from the joint distribution of the matrix elements P(X), a rather 
compact expression for the joint probability density function (PDF) of the eigen- 
values of a random (TV x N) matrix from all ensembles [15] 



P(Xi, A 2 , . . . , Ajv) = B N exp 



I 
2 



N 



^-^lnflA.-A, 

=1 i±3 



, (3.1) 



where B>n normalizes the pdf and (3 = 1, 2 and 4 correspond respectively to the 
GOE, GUE and GSE. The joint law allows one to interpret the eigenvalues as the 
positions of charged particles, repelling each other via a 2-d Coulomb potential 
(logarithmic); they are confined on a 1-d line and each is subject to an external 
harmonic potential. The parameter (3 that characterizes the type of ensemble can 
be interpreted as the inverse temperature. 

Once the joint pdf is known explicitly, other statistical properties of a random 
matrix can, in principle, be derived from this joint pdf. In practice, however this is 
often a technically daunting task. For example, suppose we want to compute the 
average density of states of the eigenvalues defined as p(X, N) — ^2f = i(S(X — 
Ai)) /N, which counts the average number of eigenvalues between A and A + d\ 
per unit length. The angled bracket () denotes an average over the joint pdf. It 
then follows that p(A, N) is simply the marginal of the joint pdf, i.e, we fix one 
of the eigenavlues (say the first one) at A and integrate the joint pdf over the rest 
of the (N — 1) variables. 



1 N oo N 

P(\N) = -Y,(S(X-X 1 ))= QdA i P(A,A 2 ,...,Aj V ). (3.2) 

i=l J -°°i=2 



Wigner was able to compute this marginal and this is one of the central results 
in the random matrix theory, known as the celebrated Wigner semi-circular law. 
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Fig. 2. The dashed line shows the semi-circular form of the average density of states. The largest 
eigenvalue is centered around its mean y/2N and fluctuates over a scale of width N^ 1 ^. The prob- 
ability of fluctuations on this scale is described by the Tracy-Widom distribution (shown schemati- 
cally). 



For large N and for any (3, 



P (X,N) 



Ntt 2 



2N 



1/2 



(3.3) 



Thus, on an average, the N eigenvalues lie within a finite interval — V2JV, y2N 
often referred to as the Wigner 'sea'. Within this sea, the average density of 
states has a semi-circular form (see Fig. |2]i that vanishes at the two edges — y/2N 
and y/2N. Note that since there are N eigenvalues distributed over the inter- 



val 



'2N, V2N 



the average spacing between adjacent eigenvalues scales as 

JV-l/2. 

From the semi-circular law, it is clear that the average of the maximum (or 
minimum) eigenvalue is \/2N ( —y/2N] . However, for finite but large N, the 



maximum eigenvalue fluctuates, around its mean \2N, from one sample to an- 
other. A natural question is: what is the full probability distribution of the largest 
eigenvalue A max ? Once again, this distribution can, in principle, be computed 
from the joint pdf in Eq. ( 13. 11 1. To see this, it is useful to consider the cumula- 
tive distribution of A max . Clearly, if A max < t, it necessarily means that all the 
eigenvalues are less than or equal to t. Thus, 



Prob [A max <t,N] 



/ l[dX i P(\ 1 ,X2,...,X N ) 

J — OC ■ i 



(3.4) 
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where the joint pdf is given in Eq. ( 13. 11 1. In practice, however, carrying out this 
multiple integration in closed form is very difficult. Relatively recently, Tracy 
and Widom [1] were able to find the limiting form of Prob [A max < t, N] for 
large N. They showed that the fluctuations of A max typically occur over a very 
narrow scale of width ~ TV -1 / 6 around its mean \/2N at the upper edge of the 
Wigner sea. It is useful to note that this scale ~ TV -1 / 6 of typical fluctuations of 
the largest eigenvalue is much bigger than the average spacing ~ A -1 / 2 between 
adjacent eigenvalues in the limit of large TV. 

More precisely, Tracy and Widom showed [1] that asymptotically for large N, 

the scaling variable £ = \/2 A 1 / 6 A max — \/2N has a limiting TV-independent 

probability distribution, Prob[£ < x] — Fp (x) whose form depends on the value 
of the parameter /3 = 1, 2 and 4 characterizing respectively the GOE, GUE and 
GSE. The function Fp(x) is called the Tracy-Widom (TW) distribution function. 
The function Fp(x), computed as a solution of a nonlinear Painleve differential 
equation [1], approaches to 1 as x — > oo and decays rapidly to zero as x — > — oo. 
For example, for (3 = 2, F^x) has the following tails [1], 

F 2 (x) -> 1 - O (exp[-4x 3/2 /3]) as x -> oo 

— * cxp[— |a;| 3 /12] as x — > — oo. (3.5) 

The probability density function fp (x) — dFp jdx thus has highly asymmetric 
tails. A graph of these functions for j3 = 1, 2 and 4 is shown in Fig. [3] A 
convenient way to express these typical fluctuations of A max around its mean 
\2N is to write, for large N, 

(3.6) 

where the random variable \ has the limiting A-independent distribution, Prob[x < 
x] = Fp(x). As mentioned in the introduction, amazingly this TW distribution 
function has since emerged in a growing variety of seemingly unrelated prob- 
lems, some of which I will discuss in the next sections. 




Large Deviations of A max : Before we end this section and proceed to the 
other problems, it is worth making the following remark. The Tracy-Widom 
distribution describes the probability of typical and small fluctuations of A max 
over a very narrow region of width ~ 0(A -1 / 6 ) around the mean (A max ) w 
\2N. A natural question is how to describe the probability of atypical and 
large fluctuations of X max around its mean, say over a wider region of width 
~ 0{N 1 / 2 )1 For example, what is the probability that all the eigenvalues of a 
random matrix are negative (or equivalently all are positive)? This is the same 
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Fig. 3. The probability density function fp(x) plotted as a function of x for /3 = 1, 2 and 4 
(reproduced from Ref . [ 1 ] ) . 



as the probability that A max < (or equivalently A m ; n > 0). Since (A max ) w 



V 2 AT, this requires the computation of the probability of an extremely rare event 

characterizing a large deviation of ~ — 0{N 1 / 2 ) to the left of the mean. This 
question naturally arises in any physical system where one is interested in the 
statistics of stationary points of a random landscape. For example, in disordered 
systems such as spin glasses one is interested in the stationary points (metastable 
states) of the free energy landscape. On the other hand, in structural glasses 
or supercooled liquids, one is interested in the stationary points of the potential 
energy landscape. In order to have a local minimum of the random landscape 
one needs to ensure that the eigenvalues of the associated Hessian matrix are all 
positive [17, 18]. A similar question recently came up in the context of random 
landscape models of anthropic principle based string theory [19,20] as well as 
in quantum cosmology [21]. Here one is interested in the statistical properties 
of vacua associated with a random multifield potential, e.g., how many minima 
are there in a random string landscape? These large deviations are also important 
in characterizing the large sample to sample fluctuations of the excitation gap in 
quantum dots connected to a superconductor [13]. 

The issue of large deviations of A max was addressed in Ref. [3] for a spe- 
cial class of matrices drawn from the Laguerre ensemble that corresponds to the 
eigenvalues of product matrices of the form W = X^X where X itself is a Gaus- 
sian matrix (real or complex). Adopting similar methods as in Ref. [3] one can 
prove that for Gaussian ensembles, the probability of large fluctuations to the left 
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of the mean 



2N behaves for large N as, 



Prob [Aj 



'in ax 



<t,N] 



exp ~/3N 2 $ 



) 



(3.7) 



where t ~ (^(iV 1 / 2 ) < \/27V is located deep inside the Wigner sea and $-(y) 
is a certain Ze/f large deviation function. On the other hand, for large fluctuations 
to the right of the mean V2N, 



for t ~ 0{N 1 / 2 ) > V2JV located outside the Wigner sea to its right and $+(y) 
is the right large deviation function. The problem then is to evaluate explicitly 
the left and the right large deviation functions explicitly. While, for the 

Laguerre ensemble, an explicit expression of $ + (y) was obtained in Ref. [3] and 
that of $(2/) recently in Ref. [22], similar expressions for the Gaussian ensemble 
were missing so far. 

Indeed, to calculate the probability that all eigenvalues are negative (or pos- 
itive) for Gaussian matrices, we need an explicit expression of $_(y) for the 
Gaussian ensemble. This is because, the probability that all eigenvalues are neg- 
ative is precisely the probability that A max < 0, and hence, from Eq. ( 13.7b 



The coefficient 9 — /3<l>_(v2) of the N 2 term inside the exponential term in 
Eq. ( 13.9b is of interest in string theory, and in Ref. [20], the authors provided an 
approximate estimate (for (3 = 1) of 6 » 1/4, along with numerical simulations. 
Recently, in collaboration with D.S. Dean, we were able to compute exactly an 
explicit expression [23] for the full left large deviation function <£>_ (y). I will not 
provide the derivation here, but the calculation of large deviations turns out to 
be somewhat simpler [23] than the calculation of the small deviations 'a la TW. 
One simply has to minimize the effective free energy of a Coulomb gas using the 
method of steepest descents and then analyze the resulting saddle point equation 
(which is an integral equation) [23]. This technique is quite useful, as it can be 
applied to other problems as well, such as the calculation of the average number 
of stationary points for a Gaussian random fields with N components in the large 
N limit [24, 25] and also the large deviation function associated with the largest 
eigenvalue of other types of matrices, such as the Wishart matrices [22]. In terms 
of the variable z = y — V2, the left large deviation function has the following 




(3.8) 




(3.9) 
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explicit expression [23] 



$_(2/ = z + V2) = — (3 + 2 In 2) + 72z 2 - 2z 4 (30z + 2z 3 ) v / 6 + z 2 

8 216 L 



27 (3 + ln(1296) - 4 In (-z + V6 



(3.10) 



In particular, the constant 6 is given exactly by 



6» = /3$(V2)=/3^ = (0.274653 ...)[3. (3.11) 

Another interesting point about the left large deviation function $_(?/) is the 
following. It describes the probability of large ~ 0(V~N) fluctuations to the 
left of the mean, i.e., when y = (V2N — A max )/v^V ~ 0(1). Now, if we 
take the y — > limit, then $_(y) should describe the small fluctuations to the 
left of the mean \/2N. In other words, we expect to recover the left tail of the 
TW distribution by taking the y — > limit in the left large deviation function. 
Indeed, as y — > 0, one finds from Eq. ( 13.101 ), that $_(y) ~ y 3 /6y/2. Putting this 
expression back in Eq. ( I3.7l i one gets 



Prob[A max < t, N] sa exp 



-AjV^iV 1 / 6 {t~V2N)\ 3 



(3.12) 



Given that \ = y/2 N 1 / 6 Ct — y /2Nj is the Tracy-Widom scaling variable, we 
find that the result in Eq. J3. 12b matches exactly with the left tail of the Tracy- 
Widom distribution for all (3. For example, for (3 — 2 one can easily verify this 
by comparing Eqs. ( 13.12b and ( 13.5b . This approach not only serves as a useful 
check that one has obtained the correct large deviation function $_(y), but also 
provides an alternative and simpler way to derive the asymptotics of the left tail 
of the TW distribution. A similar expression for the right large deviation function 
$+ (y) for the Gaussian ensemble is still missing and its computation remains an 
open problem. 

Although the Tracy-Widom distribution was originally derived as the limit- 
ing distribution of the largest eigenvalue of matrices whose elements are drawn 
from Gaussian distributions, it is now believed that the same limiting distribu- 
tion also holds for matrices drawn from a larger class of ensembles, e.g., when 
the entries are independent and identically distributed random variables drawn 
from an arbitrary distribution with all moments finite [26,27]. Recently, Biroli, 
Bouchaud and Potters [14] extended this result to power-law ensembles, where 
each entry of a random matrix is drawn independently from a power-law distri- 
bution [28,29]. They showed that as long as the fourth moment of this power-law 
distribution is finite, the suitably scaled A max is again TW distributed, but when 
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the fourth moment is infinite, A max has Frechet fluctuations [14]. It would be 
interesting to compute the probability of large deviations of A max for this power- 
law ensemble, as in the Gaussian case mentioned above. For example, what is 
the probability that all the eigenavlues of such random matrices (drawn from the 
power-law ensemble) are negative (or positive), i.e. A max < 0? This is an open 
question. 



4. The Longest Common Subsequence Problem (or the Ulam Problem) 

The longest common subsequence (LIS) problem was first stated by Ulam [30] 
in 1961, hence it is also called the Ulam's problem. Since then, a lot of re- 
search, mostly by probabilists, has been done on this problem (for a brief history 
of the problem, see the introduction in Ref. [2]). The problem can be stated 
very simply as follows. Consider a set of N distinct integers {1, 2,3,..., N}. 
Consider all AH possible permutations of this sequence. For any given per- 
mutation, let us find all possible increasing subsequences (terms of a subse- 
quence need not necessarily be consecutive elements) and from them find out 
the longest one. For example, take N = 10 and consider a particular permutation 
{8, 2, 7, 1, 3, 4, 10, 6, 9, 5}. From this sequence, one can form several increasing 
subsequences such as {8, 10}, {2, 3, 4, 10}, {1, 3, 4, 10} etc. The longest one 
of all such subsequences is either {1, 3, 4, 6, 9} as shown by the underscores or 
{2, 3, 4, 6, 9}. The length In of the LIS (in our example In — 5) is a random 
variable as it varies from one permutation to another. In the Ulam problem one 
considers all the AH permutations to be equally likely. Given this uniform mea- 
sure over the space of permutations, what is the statistics of the random variable 
In! 

Ulam found numerically that the average length (In) behaves asymptotically 
(In) ~ cVN for large N. Later this result was established rigorously by Ham- 
mersley [31] and the constant c = 2 was found by Vershik and Kerov [32]. 
Recently, in a seminal paper, Baik, Deift and Johansson (BDJ) [2] derived the 
full distribution of I n for large N. In particular, they showed that asymptotically 
for large N 

l N + 7V 1/6 x (4.1) 

where the random variable \ has a limiting iV-independent distribution, 

Prob(x < x) = F 2 (x) (4.2) 

where F% (x) is precisely the TW distribution for the largest eigenvalue of a ran- 
dom matrix drawn from the GUE (j3 — 2), as defined in Section 3. Note that the 
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Sequence: { 8, 3, 5, 1, 2, 6, 4, 7 } 




Fig. 4. The construction of piles according to the patience sorting game. The number of piles cor- 
responding to the sequence {8, 3, 5, 1, 2, 6, 4, 7} is 4, which is also the length of the LIS of this 
sequence. 



power of N in the correction term in Eq. d4.il ) is +1/6 as opposed to the asymp- 
totic law in Eq. (13.61 ) where the power of N in the correction term is —1/6. This 
means that while for random matrices of size (N x N), the typical fluctuation of 
Amax around its mean value \^2N decreases with N as TV" 1 / 6 as N — > oo (i.e., 
the distribution gets narrower ans narrower around the mean as N increases), the 
opposite happens in the Ulam problem: the typical fluctuation in 1^ around its 
mean 2%/iV increases as N 1 / 6 with increasing N, i.e., the distribution around the 
mean gets broader and broader with increasing N. 

BDJ also showed that when the sequence length N itself is a random variable 
drawn from a Poisson distribution with mean (N) = A, the length of the LIS 
converges for large A to 

l x ^2V\ + X 1/6 X , (4.3) 

where \ nas the Tracy- Widom distribution Fi{x). The fixed N and the fixed A 
ensembles are like the canonical and the grand canonical ensembles in statistical 
mechanics. The BDJ results led to an avalanche of subsequent mathematical 
works [33]. 

I will not provide here the derivation of the BDJ results, but I will assume this 
result to be known and use it later for other problems. As we will see later, in 
many problems such as in several growth models, the stratgey is to map those 
models into the LIS problem and subsequently use the BDJ results. In these 
mappings, typically the height of a growing surface in the (1 + 1) dimensional 
growth models gets mapped to the length of the LIS, i.e., schematically, H — > In- 
Subsequently, using the BDJ results for the distribution of Ijy, one shows that the 
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height in growth models is distributed accoriding to the Tracy-Widom law. I 
will show explicitly how this strategy works for one specific ballistic deposition 
model in Section 5.1. But to understand the mapping, we need to know one 
additional fact about the LIS, which I discuss below. 

Suppose we are given a specific permutation of N integers. What is a simple 
algorithm to find the length of the LIS of this permuation? The most famous 
algorithm goes by the name of Robinson-Schensted-Knuth (RSK) algorithm [34] , 
which makes a correspondence between the permutation and a Young tableaux, 
and has played a very important role in the development of the LIS problem. But 
let me not discuss this here, the reader can find a nice readable account in Ref. 
[33]. Instead, I will discuss another related algorithm known as the 'patience- 
sorting' algorithm which will be more useful for our purposes. This algorithm 
was developed first by Mallows [35] who showed its connection to the Young 
tableaux. I will discuss here the version that was discussed recently by Aldous 
and Diaconis [33]. This algorithm is best explained in terms of an example. Let 
us take N = 8 and consider a specific permuation, say {8, 3, 5, 1, 2, 6, 4, 7}. The 
'patience sorting' is a greedy algorithm that will easily find the length of the LIS 
of this sequence. It is like a simple card game of 'patience'. This game goes as 
follows: start forming piles with the numbers in the permuted sequence starting 
with the first element which is 8 in our example. So, the number 8 forms the base 
of the first pile (see Fig. 0). The next element, if less than 8, goes on top of 8. If 
not, it forms the base of a new pile. One follows a greedy algorithm: for any new 
element of the sequence, check all the top numbers on the existing piles starting 
from the first pile and if the new number is less than the top number of an already 
existing pile, it goes on top of that pile. If the new number is larger than all the top 
numbers of the existing piles, this new number forms the base of a new pile. Thus 
in our example, we form 4 distinct piles: [{8, 3, 1}, {5, 2}, {6, 4}, {7}]. Thus the 
number of piles is 4. On the other hand, for this particular example, it is easy to 
check that there are 3 LIS's namely, {3, 5, 6, 7}, {1, 2, 6, 7} and {1, 2, 4, 7}, all 
of the same length I = 4. So, we see that the length of the LIS is 4, same as the 
number of piles in the patience sorting game. But this is not an accident. One can 
easily prove [33] that for any given permutation of N integers, the length of the 
LIS I n is exactly the same as the number of piles in the corresponding 'patience 
sorting' algorithm. We will see later that this fact does indeed play a crucial role 
in our mapping of growth models to the LIS problem. 

5. Directed Polymers and Growth Models 

The problem of directed polymers in random medium has been an active area 
of research in statistical physics for the past three decades. Apart from the fact 
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that it is a simple 'toy' model of disordered systems, the directed polymer prob- 
lem has important links to a wide variety of other problems in physics, such as 
interface fluctuations and pinning [36], growing interface models of the Kardar- 
Parisi-Zhang (KPZ) variety [37], randomly forced Burger's equation in fluid dy- 
namics [38], spin glasses [39-41], and also to a single-particle quantum mechan- 
ics problem in a time-dependent random potential [42]. There are many inter- 
esting issues associated with the directed polymer problem, such as the phase- 
transition at a finite temperature in (d + 1) -dimensional directer polymer when 
d > 2 [43], the nature of the low temperature phase [40,41], the nature of the 
tranverse fluctuations [36, 44] etc. The literature on the subject is huge (for a 
review see Ref. [45]). 

Here we will focus simply at zero-temperature and a lattice version of the 
directed polymer problem. This version can be stated as in Fig. Consider a 
square lattice with O denoting the origin. On each site with coordinates 
of this lattice, there is a random energy eij, drawn independently from site to 
site, but from the identical distribution p(e). For simplicity, we will consider that 
Cij's are all negative, i.e., p(e) has support only over e S [0, — oo]. The energy 
variables eij's are quenched random variables. 

We are interested here only in directed walks for simplicity. Consider all pos- 
sible directed walk configurations (a walk that can move only north or eastward 
as shown in Fig. |5]l that start from the origin O and end up at a fixed point, say 
P with co-ordinates (x, y). An example of such a walk is shown in Fig. [5] The 
total energy E(W) of any given walk W from O to P is just the sum of site 
energies along the path W, E(W) = YlieW e »- Thus, for fixed O and P (the 
endpoints), the energy of a path varies from one path to another (all having the 
same endpoints O and P). The path having the minimum energy (optimal path) 
among these will correspond to the ground state configuration, i.e., the polymer 
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will prefer to choose this optimal path at zero temperature. Let Eo(x,y) de- 
note this minimum energy amongst all directed paths that start at O and finish at 
P : (x,y). Now, this minimum energy Eq(x, y) is, of course, a random variable 
since it fluctuates from one configuration of quenched disorder to another. One 
is interested in the statistics of Eq(x, y) for a given fixed (x, y). For example, 
what is the probability distribution of Eq(x, y) given that e XjV 's are independent 
and identically distributed random variables each drawn from p(e)? 

Mathematically, one can write an 'evolution' equation or recursion relation 
for the variable Eq(x, y). Indeed, the path that ends up at say (x, y), must have 
visited either the site (x — 1, y) or the site (a;, y — 1) at the previous step. Then 
clearly, 

Eo(x,y) = mm[E {x - 1, y), E {x, y - 1)] + e x , y (5.1) 

where e XjV denotes the random energy associated with the site (x, y). Alternately, 
we can define H(x. y) = —Eq{x, y) which are all positive variables that satisfy 
the recursion relation 



where £ XjV — —e XjV are positive random variables. The recursion relation in 
Eq. ( 15.21 ) is non-linear and hence is difficult to find the distribution of H(x, y), 
knowing the distribution of the £ x .y's- Note that, by interpreting t — x + y as a 
time-like variable, and denoting by i the transverse coordinate at a fixed t, this 
recursion relation can also be interpreted as a stochastic evolution equation, 



where the site energy ^ t can now be interpreted as a stochastic noise. In this 
interpretation, one can think of the directed polymer as a growing model of (1+1) 
dimensional interface where H(i,t) denotes the height of the interface at the 
site i of a one dimensional lattice at time t. Only, in this version, the length 
of one dimensional lattice or the substrate keeps increasing linearly with time t. 
In this respect, it corresponds to a special version of a polynuclear growth model 
where growth occurs on top of a single droplet whose linear size keeps increasing 
uniformly with time. There are, of course, several other variations of this simple 
directed polymer model [45]. For example, one can consider a version where 
the random energies are associated with bonds, rather than the sites. Similarly, 
one can consider a finite temperature version of the model. In the corresponding 
analogy to the interface model, at finite temperature, the free energy (as opposed 
to the ground state energy) of the polymer corresponds to the height variable 
of the interface. This is most easily seen in the continuum formulation of the 



H (x, y) — max [H [x 



l,y),H(x,y-l)]+C 



(5.2) 



H(i,t) = max [H(i + l,t- l),H(i - l,t - 1)] + &, t 



(5.3) 
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model by writing down the partition function as a path integral and then showing 
directly that H = In Z satisfies the KPZ equation [46]. 

A lot is known about the first and the second moment of H (x, y) (or al- 
ternatively for H(i,t) in the height language) and the associated universality 
properties [40,41,47]. For example, from simple extensivity properties, one 
would expect that average ground state energy of the path will increase linearly 
with the size (number of steps t) of the path. In terms of height, this means 
(H(i,t)) — > v(i)t for large t where v(i) is velocity of the interface at site i 
of the one dimensional lattice [48]. Also, the standard deviation of height, say 
of H(x,x) (along the diagonal), is known to grow universally, for large x as 
x 1 / 3 [45]. For the interface, this means that the typical height fluctuation grows 
as i 1 / 3 for large t, a result that is known from the KPZ problem in 1-dimension 
(via a mapping to the noisy Burgers equation). However, much less was known 
about the full distribution of H(x, y), till only recently. 

Johansson [3] was able to derive the full asymptotic distribution of H(x,y) 
evolving via Eq. ( 15.21 ) for a specific disorder distribution, where the noise £, x ,y's> 
in Eq. ( 15.21 ) are i.i.d variables taking nonnegative integer values according to the 
distribution: Prob(^ )y = k) = (1 — p)p k for k = 0, 1, 2, . . ., where < p < 1 
is a fraction. Interestingly, exactly the same recursion relation as in Eq. ( 15.21 ) and 
also with the same disorder distribution as in Johansson's model also appeared 
independently around the same time in an anisotropic directed percolation prob- 
lem studied by Rajesh and Dhar [49], a problem to which we will come back 
later when we discuss the sequence matching problem. The authors in Ref. [49] 
were able to compute exactly the first moment, but Johansson computed the full 
asymptotic distribution. He showed that for large x and y [3] 



where q = 1 — p, x is a random variable with the Tracy-Widom distribution, 
Prob(x < x) = F 2 (x) as in Eq. j4.2t , If one sets x = y = t/2, then for the 
growing droplet interpretation, it would mean that the height H (i = 0, t) has a 
mean that grows linearly with t and a standard deviation that grows as i 1 / 3 and 
when properly centered and scaled, the distribution of H (0, t) tends to the GUE 
Tracy-Widom distribution. Around the same time, Prahofer and Spohn derived a 
similar result for a class of PNG models [5]. Moreover, they were able to show 
that not just the ^(x), but other Tracy-Widom distributions such as the 
(corresponding to the GOE ensemble) also arises in the PNG model when one 
starts from different initial conditions [5]. 



H(x,y) 
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5.1. Exact Height Distribution in A Ballistic Deposition Model 

In this subsection, we will show explicitly how one can derive the exact height 
distribution in a specific (1 + 1) dimensional growth model and show that it has 
a limiting Tracy- Widom distribution. This example will illustrate explicitly how 
one maps a growth model to the LIS problem [7]. A similar mapping was used by 
Prahofer and Spohn for the PNG model [5]. But before we illustrate the mapping, 
it is useful to remark (i) why one studies such growth models and (ii) what does 
this mapping and subsequent calculation of the height distribution achieve? 

The answer to these two questions are as follows. We know that growth pro- 
cesses are ubiquitous in nature. The past few decades have seen extensive re- 
search on a wide variety of both discrete and contiuous growth models [45,50, 
51]. A large class of these growth models in (1 + 1) dimensions such as the Eden 
model [52], restricted solid on solid (RSOS) models [53], directed polymers as 
mentioned before [45], polynuclear growth models (PNG) [54] and ballistic de- 
position models (BD) [55] are believed to belong to the same universality class 
as that of the Kardar-Parisi-Zhang (KPZ) equation describing the growth of inter- 
face fluctuations [37]. This universality is, however, somewhat restricted in the 
sense that it refers only to the width or the second moment of the height fluctua- 
tions characterized by two independent exponents (the growth exponent j3 and the 
dynamical exponent z) and the associated scaling function. Moreover, even this 
restricted universality is established mostly numerically. Only in very few spe- 
cial discrete models in (1 + 1) dimensions, the exponents /3 = 1/3 and z = 3/2 
can be computed exactly via the Bethe ansatz technique [57]. A natural and im- 
portant question is whether this universality can be extended beyond the second 
moment of height fluctuations. For example, is the full distribution of the height 
fluctuations (suitably scaled) universal, i.e. is the same for different growth mod- 
els belonging to the KPZ class? Moreover, the KPZ-type equations are usually 
attributed to models with small gradients in the height profile and the question 
whether the models with large gradients (such as the BD models) belong to the 
KPZ universality class is still open. The connection between the discrete BD 
models and the continuum KPZ equation has recently been elucidated upon [58]. 

To test whether this more stringent test of universality (going beyond the sec- 
ond moment) of the full distribution is true or not, one needs to calculate the full 
height distribution in different models which are known to belong to the KPZ 
universality class as far as only the second moment is concerned. In fact, as men- 
tioned earlier, Prahofer and Spohn were able to calculate the asymptotic height 
distribution in a class of PNG models and showed that it has the Tracy-Widom 
distribution [5]. Similarly, we mentioneed earlier that Johansson [3] established 
rigorously that the height distribution, in a specific version of the directed poly- 
mer model, is of the Tracy-Widom form. Subsequently, there have been several 
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other works [6] recently, including the ballistic deposition model [7] that we will 
discuss below, that showed that indeed all these (1 + 1) dimensional growth 
models share the same common scaled height distribution (Tracy-Widom), thus 
putting the universality on a much stronger footing going beyond just the second 
moment. 

We now focus on a specific ballistic deposition model. Ballistic deposition 
models typically try to mimic columnar growth that occur in many natural sys- 
tems and have been studied extensively in the past with a variety of microscopic 
rules [55, 56], though an exact calculation of the height distribution remained 
elusive in any of these microscopic models. In collaboration with S. Nechaev, 
we found a particular ballistic deposition model which can be explicitly mapped 
to the LIS problem and hence the full asymptotic height distribution can be 
computed exactly [7]. In our (1 + 1)-D (here D stands for 'dimensional') BD 
model columnar growth occurs sequentially on a linear substrate consisting of L 
columns with free boundary conditions. The time t is discrete and is increased 
by 1 with every deposition event. We first consider the flat initial condition, i.e., 
an empty substrate at t = 0. Other initial conditions will be treated later. At any 
stage of the growth, a column (say the fc-th column) is chosen at random with 
probability p = j- and a "brick" is deposited there which increases the height of 
this column by one unit, Hk — ► Hk + 1. Once this "brick" is deposited, it screens 
all the sites at the same level in all the columns to its right from future deposition, 
i.e. the heights at all the columns to the right of the fc-th column must be strictly 
greater than or equal to Hk + 1 at all subsequent times. For example, in Fig. [6] 
the first brick (denoted by 1) gets deposited at t = 1 in the 4-th column and it im- 
mediately screens all the sites to its right. Then the second brick (denoted by 2) 
gets deposited at t = 2 again in the same 4-th column whose height now becomes 
2 and thus the heights of all the columns to the right of the 4-th column must be 
> 2 at all subsequent times and so on. Formally such growth is implemented by 
the following update rule. If the fc-th site is chosen at time t for deposition, then 

H k (t + l)=max{H k (t),H k _ 1 (t),...,H 1 {t)} + l. (5.5) 

The model is anisotropic and evidently even the average height profile (Hk(t)) 
depends nontrivially on both the column number fc and time t. Our goal is to 
compute the asymptotic height distribution Pk (H, t) for large t. 

It is easy to find the height distribution Pi (H, t) of the first column, since 
the height there does not depend on any other column. At any stage, the height 
in the first column either increases by one unit with probability p = (if this 
column is selected for deposit) or stays the same with probability 1 — p. Thus 
Pi(H,t) is simply the binomial distribution, Pi(H,t) = (#)p' l (l -pY~ H with 
H < t. The average height of the first column thus increases as (Hi(t)) = pt 
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Fig. 6. Growth of a heap with asymmetric long-range interaction. The numbers inside cells show the 
times at which the blocks are added to the heap. 



for all t and its variance is given by <Ji(t) — tp(l — p). While the first column 
is thus trivial, the dynamics of heights in other columns is nontrivial due to the 
right-handed infinite range interactions between the columns. For convenience, 
we subsequently measure the height of any other column with respect to the first 
one. Namely, by height hk (t) we mean the height difference between the (fc + 1)- 
th column and the first one, hkif) — Hk+\{t) — -Hi(i), so that ho(t) = for all 
t. 

To make progress for columns k > 0, we first consider a (2+l)-D construction 
of the heap as shown in Fig. [7] by adding an extra dimension indicating the time 
t. In Fig. |7l the x axis denotes the column number, the y axis stands for the time 
t and the z axis is the height h. In this figure, every time a new block is added, 
it "wets" all the sites at the same level to its "east" (along the x axis) and to its 
"north" (along the time axis). Here "wetting" means "screening" from further 
deposition at those sites at the same level. This (2 + 1)-D system of "terraces" is 
in one-to-one correspondence with the (1 + 1)-D heap in Fig. [6] This construction 
is reminiscent of the 3D anisotropic directed percolation (ADP) problem studied 
by Rajesh and Dhar [49]. Note however, that unlike the ADP problem, in our 
case each row labelled by t can contain only one deposition event. 

The next step is to consider the projection onto the 2D (x, y)-plane of the level 
lines separating the adjacent terraces whose heights differ by 1 . In this projection, 
some of the level lines may overlap partially on the plane. To avoid the overlap 
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Fig. 7. (2 + 1) dimensional "terraces" corresponding to the growth of a heap in Fig. [6] 

for better visual purposes, we make a shift (x, y) — > (x + h(x, y),y) and repre- 
sent these shifted directed lines on the 2D plane in Fig. [8] The black dots in Fig. 
|8]denote the points where the deposition events took place and the integer next to 
a dot denotes the time of this event. Note that each row in Fig. [8]contains a single 
black dot, i.e., only one deposition per unit of time can occur. In Fig. [8] there are 
8 such events whose deposition times form the sequence {1, 2, 3, 4, 5, 6, 7, 8} of 
length N = 8. Now let us read the deposition times of the dots sequentially, but 
now column by column and vertically from top to bottom in each column, start- 
ing from the leftmost one. Then this sequence reads {8, 3, 5, 1, 2, 6, 4, 7} which 
is just a permutation of the original sequence {1, 2, 3, 4, 5, 6, 7, 8}. In the per- 
muted sequence {8, 3, 5, 1, 2, 6, 4, 7} there are 3 LIS's: {3, 5, 6, 7}, {1, 2, 6, 7} 
and {1, 2, 4, 7}, all of the same length In =4. As mentioned before (see Fig. 
0), this is precisely the number of piles in the patience sorting of the permutation 



Let us note one immediate fact from Fig. [8] The numbers belonging to the 
different level lines in Fig. [8] are in one-to-one correspondence with the piles 
[{8, 3, 1}, {5, 2}, {6, 4}, {7}] in Aldous-Diaconis patience sorting game. Hence, 
each pile can be identified with an unique level line. Now, the height h(x, t) at 
any given point (x, t) in Fig. |8]is equal to the number of level lines inside the 
rectangle bounded by the corners: [0, 0], [x, 0], [0, t], [x, t]. Thus, we have the 
correspondence: height = number of level lines = number of piles = length l n 
of the LIS. However, to compute l n , we need to know the value of n which is 
precisely the number of black dots inside this rectangle. 

Once the problem is reduced to finding the number of black dots or deposition 
events, we no longer need the Fig. [8] (as it may confuse due to the visual shift 
(x, y) — > (x + h(x, y), y)) and can go back to Fig. [7] where the north-to-east 
corners play the same role as the black dots in Fig. [7] In Fig. [7] to determine 
the height hkif) of the fc-th column at time f, we need to know the number of 




123456789 
column number, k 



{8,3,5,1,2,6,4,7}. 
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Fig. 8. The directed lines are the level lines separating adjacent terraces with height diffrence 1 in 
Fig. 2, projected onto the (x,y) plane and shifted by (x,y) — > (x + h(x,y),y) to avoid partial 
overlap. The black dots denote the deposition events. The numbers next to the dots denote the times 
of those deposition events. 



deposition events inside the 2D plane rectangle t bounded by the four corners 
[0, 0], [k, 0], [0, t], [k, t}. Let us begin with the last column k = L. For k = L 
the number of deposition events N in the rectangle Rh,t is equal to the time t 
because there is only one deposition event per time. In our example N = t = 8. 
For a general k < L the number of deposition events N inside the rectangle 
Rk,t is a random variable, since some of the rows inside the rectangle may not 
contain a north-to-east corner or a deposition event. The probability distribution 
Pk,t{N) (for a given [k, t]) of this random variable can, however, be easily found 
as follows. At each step of deposition, a column is chosen at random from any 
of the L columns. Thus, the probability that a north-to-east corner will fall on 
the segment of line [0, k] (where k < L) is equal XakjL. The deposition events 
are completely independent of each other, indicating the absence of correlations 
between different rows labelled by t in Fig. [7] So, we are asking the question: 
given t rows, what is the probability that N of them will contain a north-to-east 
corner? This is simply given by the binomial distribution 



PkA N ) = 




(5.6) 



where N <t. Now we are reduced to the following problem: given a sequence of 
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integers of length N (where N itself is random and is taken from the distribution 
in Eq. ( 15.6b ). what is the length of the LIS? Recall that this length is precisely the 
height hk (t) of the fc-th column at time t in our model. In the thermodynamic 
limit L — > oo for t > 1 and any fixed k such that the quotient A = ^ remains 
fixed but is arbitrary, the distribution in Eq. ( 15.6b becomes a Poisson distribution 
P(N) — > e~ A ^-, with the mean A = ^. We can then directly use the BDJ 
result in Eq. (14.3b to predict our main result for the height in the BD model, 



for large X = tk /L, where the random variable \ nas the Tracy-Widom distribu- 
tion i*2(x) as m Eq. (I4.21 i. Using the known exact value (x) = —1.7711... from 
the Tracy-Widom distribution [1], we find exactly the asymptotic average height 
profile in the BD model, 



The leading square root dependence of the profile on the column number k has 
been seen numerically. Eq. ( 15.81 ) also predicts an exact sub-leading term with 
A: 1 / 6 dependence. Similarly, for the variance, u^(t) = {[hk{t) — (ft-fc(i))] 2 ), we 

find asymptotically: a 2 k {t) -► c (y;) 1 ^ , where c = <[x - (x)] 2 ) = 0.8132... 
[1]. Eliminating the t dependence for large t between the average and the vari- 
ance, we get, a1(t) w a{hk(t)) 2/3 where the constant a = co/2 2 / 3 = 0.51228... 
and f3 — 1/3, thus recovering the KPZ scaling exponent. In addition to the BD 
model with infinite range right-handed interaction reported here, we have also 
analyzed the model (analytically within a mean field theory and numerically) 
when the right-handed interaction is short ranged. Somewhat suurprisingly and 
pleasantly, we found that the asymptotic average height profile is independent of 
the range of interaction. A recent analysis of the short range BD model sheds 
light on this fact [59]. 

So far, we have demonstrated that for a flat initial condition, the height fluctu- 
ations in the BD model follow the Tracy-Widom distribution Fgue{x) which 
corresponds to the distribution of the largest eigenvalue of a random matrix 
drawn from a Gaussian unitary ensemble. In the context of the PNG model, 
Prahofer and Spohn [5] have shown that while the height fluctuations of a sin- 
gle PNG droplet follow the distribution Fgve(x), it is possible to obtain other 
types of universal distributions as well. For example, the height fluctuations in 
the PNG model growing over a flat substrate follow the distribution Fqoe(x) 
where Fqqb(x) is the distribution of the largest eigenvalue of a random matrix 




(5.7) 




(5.8) 
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drawn from the Gaussian orthogonal ensemble. Besides, in a PNG droplet with 
two external sources at its edges which nucleate with rates p + and p_, the height 
fluctuations have different distributions depending on the values of p + and p_ . 
For p + < 1 and p_ < 1, one gets back the distribution Fque(x). If how- 
ever p + — 1 and p_ < 1 (or alternatively p_ = 1 and p + < 1), one gets the 
distribution Fq OE (x) which corresponds to the distribution of the largest of the 
superimposed eigenvalues of two independent GOE matrices. In the critical case 
p + = 1 and p_ = 1, one gets a new distribution Fq(x) which does not have any 
random matrix analogy. For p + > 1 and p_ > 1, one gets Gaussian distribu- 
tion. These results for the PNG model were obtained in Ref. [5] using a powerful 
theorem of Baik and Rains [4]. 

The question naturally arises as to whether these other distributions, apart 
from the Fqtje(x), can also appear in the BD model considered in this paper. 
Indeed, they do. For example, if one starts with a staircase initial condition 
hk(0) = k for the heights in the BD model, one gets the distribution Fq OE (x) 
for the scaled variable \. This follows from the fact that for the staircase initial 
condition, in Fig. 2 there will be a black dot (or a north-to-east corner) at every 
value of k on the k axis at t = 0. Thus the black dots appear on the k axis with 
unit density. This corresponds to the case p + — 1 and p_ = of the general 
results of Baik and Rains which leads to a Fq OE (x) distribution. Of course, the 
density p + can be tuned between and 1, by tuning the average slope of the 
staircase. For a generic < p + < 1, one can also vary p_ by putting an external 
source at the first column. Thus one can obtain, in principle, most of the distri- 
butions discussed in Ref. [4] by varying p + and p_ . Note that the case p_ = 1 
(external source which drops one particle at the first column at every time step) 
and p + = (flat substrate) is, however, trivial since the surface then remains 
flat at all times and the height just increases by one unit at every time step. The 
distribution Fqoe (x) is, however, not naturally accessible within the rules of our 
model. 



6. Sequence Matching Problem 

In this section, I will discuss a different problem namely that of the alignment 
of two random sequences and will illustrate how the Tracy- Widom distribution 
appears in this problem. This is based on a joint wotk with S. Nechaev [12]. 

Sequence alignment is one of the most useful quantitative methods used in 
evolutionary molecular biology [60-62]. The goal of an alignment algorithm is 
to search for similarities in patterns in different sequences. A classic and much 
studied alignment problem is the so called 'longest common subsequence' (LCS) 
problem. The input to this problem is a pair of sequences a = {ai,a 2 , ■ ■ ■ , on} 
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a: 



A C G C T 
C T G "A C 




A 



C 



No. of matches in matching with solid lines = 4 
No. of matches in matching with dashed lines = 2 



Fig. 9. Two fixed sequences a : {A, C, G, C, T, A, C} and /3 : {C,T,G,A,C}. The solid 
lines show the common subsequence {C, G, A, C} and the dashed lines denote another common 
subsequence {A, C}. 



(of length i) and (3 = {(3i, (3%, . . . , /3j} (of length j). For example, a and (3 can 
be two random sequences of the 4 base pairs A, C, G, T of a DNA molecule, 
e.g., a = {A,C,G,C,T,A,C} and /? = {C, T, G, A, C}. A subsequence 
of a is an ordered sublist of a (entries of which need not be consecutive in 
a), e.g, {C, G, T, C}, but not {T, G, C}. A common subsequence of two se- 
quences a and (3 is a subsequence of both of them. For example, the subse- 
quence {C, G, A, C} is a common subsequence of both a and (3. There can be 
many possible common subsequences of a pair of sequences. For example, an- 
other common subsequence of a and (3 is {A, C}. One simple way to construct 
different common subsequences (for two fixed sequences a and (3) is by drawing 
lines from one member of the set a to another member of the set (3 such that 
the lines can not cross. For example, the common subsequence {C, G, A, C} 
is shown by solid lines in Fig. [9] On the other hand the common subsequence 
{A, C} is shown by the dashed lines in Fig. [9] The aim of the LCS problem is 
to find the longest of such common subsequences between two fixed sequences 
a and (3. 

This problem and its variants have been widely studied in biology [63-66], 
computer science [61,67-69], probability theory [70-75] and more recently in 
statistical physics [76-78]. A particularly important application of the LCS prob- 
lem is to quantify the closeness between two DNA sequences. In evolutionary 
biology, the genes responsible for building specific proteins evolve with time and 
by finding the LCS of the same gene in different species, one can learn what has 
been conserved in time. Also, when a new DNA molecule is sequenced in vitro, it 
is important to know whether it is really new or it already exists. This is achieved 
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quantitatively by measuring the LCS of the new molecule with another existing 
already in the database. 

For a pair of fixed sequences of length i and j respectively, the length Lij of 
their LCS is just a number. However, in the stochastic version of the LCS prob- 
lem one compares two random sequences drawn from c alphabets and hence the 
length Lij is a random variable. A major challenge over the last three decades 
has been to determine the statistics of L it j [70-74]. For equally long sequences 
(i = j = n), it has been proved that (£„,„) ~ j c n for n» 1, where the averag- 
ing is performed over all realizations of the random sequences. The constant j c 
is known as the Chvatal-Sankoff constant which, to date, remains undetermined 
though there exists several bounds [71,73,74], a conjecture due to Steele [72] 
that 7 C = 2/(1 + y/c) and a recent proof [75] that -f c — > 2/y/c as c — > oo. 
Unfortunately, no exact results are available for the finite size corrections to the 
leading behavior of the average (L„.„), for the variance, and also for the full 
probability distribution of L n , n . Thus, despite tremendous analytical and nu- 
merical efforts, exact solution of the random LCS problem has, so far, remained 
elusive. Therefore it is important to find other variants of this LCS problem that 
may be analytically tractable. 

Computationally, the easiest way to determine the length Lij of the LCS of 
two arbitrary sequences of lengths i and j (in polynomial time <~ 0(ij)) is via 
using the recursive algorithm [61,78] 

= max Lij-i, Lj_i ;J _i + rjij] , (6.1) 

subject to the initial conditions Lj ; o = Lqj = Lo,o = 0. The variable r/ij is 
either 1 when the characters at the positions i (in the sequence a) and j (in the 
sequence /?) match each other, or if they do not. Note that the variables rjij's are 
not independent of each other. To see this consider the simple example - match- 
ing of two strings a = AB and (3 — AA. One has by definition: 7714 = 771,2 = 1 
and 772,1 = 0. The knowledge of these three variables is sufficient to predict that 
the last two letters will not match, i.e., 772.2 = 0. Thus, 772,2 can not take its 
value independently of 7714, 771,2, 772,1- These residual correlations between the 
Tji j variables make the LCS problem rather complicated. Note however that for 
two random sequences drawn from c alphabets, these correlations between the 
r]ij variables vanish in the c — > 00 limit. 

A natural question is how important are these correlations between the rjij 
variables, e.g., do they affect the asymptotic statistics of Lij's for large i and 
j? Is the problem solvable if one ignores these correlations? These questions 
naturally lead to the Bernoulli matching (BM) model which is a simpler variant 
of the original LCS problem where one ignores the correlations between rjij 's 
for all c [78]. The length LfJ 1 of the BM model satisfies the same recursion 
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relation in Eq. ( 16. Il l except that r?ij 's are now independent and each drawn from 
the bimodal distribution: p(rf) = {l/c)8 v> \ + (l — l/c)S n fl. This approximation is 
expected to be exact only in the appropriately taken c — > oo limit. Nevertheless, 
for finite c, the results on the BM model can serve as a useful benchmark for 
the original LCS model to decide if indeed the correlations between Tj^j's are 
important or not. Unfortunately, even in the absence of correlations, the exact 
aymptotic distribution of Lf AI in the BM model has so far remained elusive, 
mainly due to the nonlinear nature of the recursion relation in Eq. ( 16.11 ). The 
purpose of this Rapid Communication is to present an exact asymptotic formula 
for the distribution of the length L^ A n ! in the BM model for all c. So far, only the 
leading asymptotic behavior of the average length in the BM model is known [78] 
using the 'cavity' method of spin glass physics [79], 

ItBM\ ^ „,BA'T 

\ L n,n ) ~ 7 C n (6.2) 

where = 2/(1 + s/c), same as the conjectured value of the Chvatal-Sankoff 
constant 7 C for the original LCS model. However, other properties such as the 
variance or the distribution of L^ M remained untractable even in the BM model. 
We have shown [12], as illustrated below, that for large n, 

L™->^ M n + f(c)n^ X (6.3) 

where \ is a random variable with a n-independent distribution, Prob(x < x) = 
F2(x) which is precisely the Tracy-Widom distribution in Eq. (14.2b . Indeed, we 
were also able to compute the functional form of the scale factor /(c) exactly for 
all c [12], 

This allows us to calculate the average including the subleading finite size cor- 
rection term and the variance of L„ „ for large n, 



(<X 2 )-(X) 2 ) / 2 (c)n 2 / 3 , (6.5) 



where one can use the known exact values [1], (x) = —1.7711 . . . and (x 2 ) — 
(x)' 2 = 0.8132 . . .. These exact results thus invalidate the previous attempt [78] 
to fit the subleading correction to the mean in the BM model with a n 1 / 2 /ln(n) 
behavior and also to fit the scaled distribution with a Gaussian form. Note that the 
recursion relation in Eq. d6.lt can also be viewed as a (1+1) dimensional directed 
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polymer problem [77,78] and some asymptotic results (such as the 0(n 2 / 3 ) be- 
havior of the variance of L n>n for large n) can be obtained using the arguments 
of universality [77]. However, this does not provide precise results for the full 
distribution along with the correct scale factors that are obtained here. 

It is useful to provide a synopsis of our method in deriving these results. First, 
we prove the results in the c — * oo limit, by using mappings to other models. To 
make progress for finite c, we first map the BM model exactly to a 3-d anisotropic 
directed percolation (ADP) model first studied by Rajesh and Dhar [49]. This 
ADP model is also precisely the same as the directed polymer model studied by 
Johansson [3], as discussed in the previous section and for which the exact results 
are known as in Eq. ( 15.41 ). To extract the results for the BM model from those of 
Johansson's model, we use a simple symmetry argument which then allows us to 
derive our main results in Eqs. ( l6.3b - do*3l ) for all c. As a check, we recover the 
c — > oo limit result obtained independently by the first method. 

In the BM model, the length Lfj can be interpreted as the height of a surface 
over the 2 dimensional plane constructed via the recursion relation in Eq. 
d6.lt . A typical surface, shown in Fig. [I0](a), has terrace-like structures. 

It is useful to consider the projection of the level lines separating the adjacent 
terraces whose heights differ by 1 (see FigfTTb onto the 2-D plane. Note 
that, by the rule Eq. d6.1) . these level lines never overlap each other, i.e., no 
two paths have any common edge. The statistical weight of such a projected 2-D 
configuration is the product of weights associated with the vertices of the 2-D 
plane. There are five types of possible vertices with nonzero weights as shown 
in Fig. [TT] where p = 1/c and q = 1 — p. Since the level lines never cross each 
other, the weight of the first vertex in Fig. [TT]is 0. 

Consider first the limit c — > oo (i.e., p — > 0). The weights of all allowed 
vertices are 1, except the ones shown by black dots in Fig. [TT] whose associated 
weights are p — > 0. The number N of these black dots inside a rectangle of 
area A = ij can be easily estimated. For large A and p — > 0, this number is 



Random Matrices, the Ulam Problem, Directed Polymers & Growth Models, and Sequence MatchiB^ 



10 
9 
8 
7 

' 6 
5 

[ 4 
3 
2 
1 




0123456789 10 



x (space) 



+ 









q 






1 






1 




L 


p 


1 




1 



vertex weights 



Fig. 11. Projected 2-d level lines separating adjacent terraces of unit height difference in the BM 
surface in Fig l 101 (a). The adjacent table shows the weights of all vertices on the 2-d plane. 



clearly Poisson distributed with the mean N = pA. The height Lf^ 1 is just 
the number of level lines Af inside this rectangle of area A = ij. One can easily 
estimate Af by following precisely the method outlined in the previous subsection 
in the context of the ballistic deposition model. Following the same analysis as 
in the ballistic deposition model, it is easy to see that the number of level lines Af 
inside the rectangle (for large A), appropriately scaled, has a limiting behavior, 

TV -> 2\/N + Nx, where X is a random variable with the Tracy-Widom 
distribution. Using N — pA = ij/c, one then obtains in the limit p — > 0, 

In particular, for large equal length sequences i = j = n, we get for c — > oo 

Lf^^^n + c-^n^x. (6.7) 

For finite c, while the above mapping to the LIS problem still works, the corre- 
sponding permutations of the LIS problem are not generated with equal proba- 
bility and hence one can no longer use the BDJ results. 

For any finite c, we can however map the BM model to the ADP model studied 
by Rajesh and Dhar [49]. In the ADP model on a simple cubic lattice the bonds 
are occupied with probabilities p x , p y , andp 2 along the x, y and z axes and are all 
directed towards increasing coordinates. Imagine a source of fluid at the origin 
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which spreads along the occupied directed bonds. The sites that get wet by the 
fluid form a 3-d cluster. In the ADP problem, the bond occupation probabilities 
are anisotropic, p x = p y = 1 (all bonds aligned along the x and y axes are 
occupied) and p z = p. Hence, if the point (x, y, z) gets wet by the fluid then 
all the points (x 1 , y', z) on the same plane with x' > x and y' > y also get wet. 
Such a wet cluster is compact and can be characterized by its bounding surface 
height H(x, y) as shown in Fig. (lb). It is not difficult to see [49] that the height 
H (x, y) satisfies exactly the same recursion relation of the directed polymer as 
in Eq. (15.21 ) where £x, v '$ are i.i.d. random variables taking nonnegative integer 
values with Probf^ = k) = (1 - p)p k for k = 0, 1, 2, . . .. Thus the ADP 
model of Rajesh and Dhar is precisely identical to the directed polymer model 
studied by Johansson with exactly the same distribution of the noise y). 

While the terrace-like structures of the ADP surface look similar to the BM 
surfaces (compare Figs. (TTUIa) and (fTUlb). there is an important difference be- 
tween the two. In the ADP model, the level lines separating two adjacent ter- 
races can overlap with each other [49], which does not happen in the BM model. 
However, by making the following change of coordinates in the ADP model [49] 

C = x + h(x,y); r) = y + h(x,y) (6.8) 

one gets a configuration of the surface where the level lines no longer overlap. 
Moreover, it is not difficult to show that the projected 2-D configuration of level 
lines of this shifted ADP surface has exactly the same statistical weight as the 
projected 2-D configuration of the BM surface. Denoting the BM height by 
h(x,y) = L^y 1 , one then has the identity, h((,rj) = h(x,y), which holds for 
each configuration. Using Eq. ( 16.81 l. one can rewrite this identity as 

h(C, t}) = h(c- h(C, V), V - KC, t})) ■ (6.9) 

Thus, for any given height function h(x,y) of the ADP model, one can, in 
principle, obtain the corresponding height function h(x,y) for all (x,y) of the 
BM model by solving the nonlinear equation ( 16.9b . This is however very difficult 
in practice. Fortunately, one can make progress for large (x, y) where one can 
replace the integer valued discrete heights by continuous functions h(x, y) and 
h(x,y). Using the notation d x = d/dx it is easy to derive from Eq. ( I6.8I 1 the 
following pair of identities, 

9xh= ^ ^ d yh= ^ (6.10) 

1 — d^h — djjh 1 — d^h — d v h 

In a similar way, one can show that 

d ( h= - ; d v h= I . (6.11) 

l + o x h + o v h l + o x h + o v h 
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We then observe that Eqs. ( 16.101 ) and ( 16. lit are invariant under the simultaneous 
transformations 



( -> -x; rj^ -y; h 



(6.12) 



Since the height is built up by integrating the derivatives, this leads to a simple 
result for large £ and rj, 



h(t,r,) = h(-(,-r)). 



(6.13) 



Thus, if we know exactly the functional form of the ADP surface h(x, y), then 
the functional form of the BM surface h(x,y) for large x and y is simply obtained 
by h(x,y) — h(—x,—y). Changing x — > — x and y — > — y in Johansson's 
expression for the ADP surface in Eq. ( 15.41 ) we thus arrive at our main asymptotic 
result for the BM model 



- BM 
J x,y 



h(x,y) - 
(pxy) 1 / 6 



2^/pxy - p(x + y) 



(1+p) 



[x + y) 



2/3 



(6.14) 
n, Eq. 



where p — 1/c and q = 1 — 1/c. For equal length sequences x = y 
(16.141 ) then reduces to Eq. ( 16.3b . 

To check the consistency of our asymptotic results, we further computed the 
difference between the left- and the right-hand sides of Eq 



Ah((, rj) = h((, rj) -h((- h(C, rj), V - h((, r?)) , 



(6.15) 



with the functions h(x, y) and h(x, y) given respectively by Eqs. ( 15.41 ) and ( 16.141 ). 
For large £ = rj one gets 



AMCC) - P V V/3(1 - VP) 4/3 



-1/3 



(6.16) 



Thus the discrepancy falls off as a power law for large (, indicating that in- 
deed our solution is asymptotically exact. We have also performed numeri- 
cal simulations of the BM model using the recursion relation in Eq. ( 16. U for 
c = 2, 4, 9, 16, 100. Our preliminary results [12] for relatively small system 
sizes (up to n — 5000) are consistent with our exact results in Eqs. d6.3l )-( l63] ). 

Thus, the Tracy- Widom distribution also describes the asymptotic distribution 
of the optimal matching length in the BM model, for all c. Given that the cor- 
relations in the original LCS model become negligible in the c — > oo limit, it is 
likely that the BM asymptotics in Eq. (16.7b would also hold for the original LCS 
model in the c — > oo limit. An important open problem is to determine whether 
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the Tracy- Widom distribution also appears in the LCS problem for finite c. The 
precise distribution obtained here (including exact prefactors) for all c in the BM 
model will serve as a useful benchmark to which future simulations of the LCS 
problem can be compared. 



7. Conclusion 

In these lectures I have discussed 4 a priori unrelated problems and tried to give 
a flavour of the recent developments that have found a deep connection between 
these problems. These connections have now established the fact that they all 
share one common limiting distribution, namely the Tracy-Widom distribution 
that describes the asymptotic distribution law of the largest eigenvalue of a ran- 
dom matrix. I have also discussed the probabilities of large deviations of the 
largest eigenvalue, in the range outside the validity of the Tracy-Widom law. As 
examples, I have demonstrated in detail, in two specfic models a ballistic deposi- 
tion model and a sequence alignment problem, how they can be mapped on to the 
longest increasing subsequence problem and consequently proving the existence 
of the Tracy-Widom distribution in these models. 

There have been many other interesting recent developments in this rather 
broad area encompassing different fields that I did not have the scope to discuss 
in these lectures. There are, of course, plenty of open questions that need to be 
addressed, some of which I mention below. 

Finite size effects in growth models: We have discussed how the Tracy-Widom 
distribution appears as the limiting scaled height distribution in several (1 + 1) 
dimensional growth models that belong to the KPZ universality class of fluctuat- 
ing interfaces. Indeed, for a fluctuating surface with height H (x, t) growing over 
a substrate of infinite size one now believes that at long times t » 1 

H(x,t) = vt + bt 1/3 X (7.1) 

where x is a time-independent random variable with the Tracy-Widom distri- 
bution. The prefactors v (the velocity of the interface) and b are model depen- 
dent, but the distribution of the scaled variable \ — [H ~ vty/bt 1 / 3 is uni- 
versal for large t. The nonuniversal prefactors are often very hard to compute. 
We have shown two examples in these lectures where these prefactors can be 
computed exactly. Note, however, that the result in Eq. ( 17. Il l holds only in 
an infinite system. In any real system with a finite substrate size L, the result 
in Eq. d7.U will hold only in the growing regime of the surface, i.e., when 
1 << t << L z , where z is the dynamical exponent characterizing the surface 
evolution. For example, for the KPZ type of interfaces in (1 + 1) dimensions, 
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z = 3/2. However, when t >> L z , the probability distribution of the height 
fluctuation H — (H) will become time-independent. For example, for (1 + 1) di- 
mensional KPZ surfaces with periodic boundary conditions, it is well known [45] 
that the stationary distribution of the height fluctuation is a simple Gaussian, 
Prob[i7 — (H) — x] oc exp[— x 2 ja^L] where ao is a nonuniversal constant and 
the typical fluctuation scales with the system size as L 1 / 2 . An important open 
question is how does the distribution of the height fluctuation crosses over from 
the Tracy-Widom form to a simple Gaussian form as t becomes bigger than the 
crossover time L z . It would be nice to show this explicitly in any of the simple 
models discussed above. 

A direct connection between the growth models and random matrices: The ex- 
istence of the Tracy-Widom distribution in many of the growth models discussed 
here, such as the polynuclear growth model or the ballistic deposition model, 
rely on the mapping to the LIS problem and subsequently using the BDJ results 
that connect the LIS problem to random matrices. It is certainly desirable to find 
to a direct mapping between the growth models and the largest eigenvalue of a 
random matrix. Recent work by Spohn and collaborators [11] linking the top 
edge of a PNG growth model to Dyson's brownian motion of the eigenvalues of 
a random matrix perhaps provides a clue to this missing link. 

Largest Lyapunov exponent in population dynamics: The Tracy-Widom distri- 
bution and the associated large-deviation function discussed in Section 3 conceiv- 
ably have important applications in several systems where the largest eigenvalue 
controls the spectral properties of the system. Some examples were discussed 
in Section 3. Recently, it has been shown that the statistics of largest eigenvalue 
(the largest Lyapunov exponent) is also of importance in population growth of or- 
ganisms in fluctuating environments [80]. It would be interesting to see if Tracy- 
Widom type distribution functions also appear in these biological problems. 

Sequence matching, directed polymer and vertex models: In the context of the 
sequence matching problem discussed in Section 6, we have demonstrated how 
the statistical weights of the surface generated in the Bernoulli matching model 
of the sequence alignment are exactly identical to that of a 5-vertex model on a 
square lattice (see Fig. fTTI) . This is a useful connection because there are many 
quantities in the 5-vertex models that can be computed exactly by employing 
the Bethe ansatz techniques and subsequently one can use those results for the 
sequence alignment or equivalently for the directed polymer model. Recently, 
in collaboration with K. Mallick and S. Nechaev, we have made some progress 
in these directions [81]. A very interesting open issue is if one can derive the 
Tracy-Widom distribution by using the Bethe ansatz techniques. 

Other issues related to the sequence matching problem: There are also many 
other interesting open questions associated with the sequence matching prob- 
lem. We have shown that the length of the longest matching is Tracy-Widom 
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distributed only in the Bernoulli matching model which is a simpler version of 
the original LCS problem. In the BM model one has ignored certain correlations, 
as we discussed in detail. This approximation is exact in the c — > oo limit, where 
c is the number of different types of alphabets, e.g. for DNA, c = 4. Is this 
approximation good even for finite c? In other words, is the optimal matching 
length in the original LCS problem also Tracy-Widom distributed? It would also 
be interesting if one can make a systematic 1/c expansion of the LCS model, i.e., 
keeping the correlations up to 0(l/c). Numerical simulations the LCS prob- 
lem [82] for binary sequence c = 2 indeed indicates that the standard deiviation 
of the optimal matching length scales as n 1 / 3 where n is the sequence size, as in 
the BM model, the question is if the scaled distribution is also Tracy-Widom or 
not. For the original LCS problem, there is also a curious result due to Bonetto 
and Matzinger [82] that claims that if the value of c for the two sequences are 
not the same (for example, the first sequence may be drawn randomly from 3 
alphabets and the second may be a binary sequence), then the standard deviation 
of the optimal matching length scales as n 1 / 2 for large n, which is rather surpris- 
ing! It would be interesting to study the statistics of optimal matches between 
more than two sequences. Finally, here we have just mentioned the matching of 
random sequences. It would be interesting and important to study the statistics of 
optimal matching lengths between non-random sequences, e.g., when there are 
some correlations between the members of any given sequence. 
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